23 research outputs found

    A Comparative Analysis of Data Mining Techniques on Breast Cancer Diagnosis Data using WEKA Toolbox

    Breast cancer is considered the second most common cancer in women compared to all other cancers. It is fatal in less than half of all cases and is the main cause of mortality in women. It accounts for 16% of all cancer mortalities worldwide. Early diagnosis of breast cancer increases the chance of recovery. Data mining techniques can be utilized in the early diagnosis of breast cancer. In this paper, an academic experimental breast cancer dataset is used to perform a data mining practical experiment using the Waikato Environment for Knowledge Analysis (WEKA) tool. The WEKA Java application represents a rich resource for conducting performance metrics during the execution of experiments. Pre-processing and feature extraction are used to optimize the data. The classification process used in this study was summarized through thirteen experiments. Additionally, 10 experiments using different classification algorithms were conducted. The introduced algorithms were: Naïve Bayes, Logistic Regression, Lazy IBK (Instance-Based learning with parameter K), Lazy Kstar, Lazy Locally Weighted Learner, Rules ZeroR, Decision Stump, Decision Trees J48, Random Forest and Random Trees. The process of producing a predictive model was automated with the use of classification accuracy. Further, several experiments on classification of the Wisconsin Diagnostic Breast Cancer and Wisconsin Breast Cancer datasets were conducted to compare the success rates of the different methods. The results show that the Lazy IBK (k-NN) classifier can achieve 98% accuracy among the classifiers compared. The main advantages of the study were the compactness of using 13 different data mining models and 10 different performance measurements, and plotting figures of classification errors.
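
    As a rough illustration of the instance-based (k-NN) classification idea behind WEKA's IBk, the sketch below implements a minimal k-nearest-neighbour vote in plain Python. It is not the WEKA implementation; the function name `knn_predict`, the two-feature toy data, and the choice of k=3 are assumptions made only for this example.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Majority vote over the labels of the k closest samples
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up feature vectors (NOT taken from the Wisconsin datasets)
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

prediction = knn_predict(train_X, train_y, (4.9, 5.1), k=3)
```

    In practice the features would be scaled first, since Euclidean distance is sensitive to differing units.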

    GFLIB: an Open Source Library for Genetic Folding Solving Optimization Problems

    This paper presents GFLIB, a Genetic Folding MATLAB toolbox for supervised learning problems. In essence, the goal of GFLIB is to build a concise model of supervised learning, and a free open source MATLAB toolbox for performing classification and regression. GFLIB is specifically designed for most of the traditionally used features, to evolve in applications of mathematical models. The toolbox suits all kinds of users, from those who use GFLIB as a “black box” to advanced researchers who want to generate and test new functionalities and parameters of the GF algorithm. The toolbox and its documentation are freely available for download at: https://github.com/mohabedalgani/gflib.git

    Enhanced Artificial Intelligence System for Diagnosing and Predicting Breast Cancer Using Deep Learning

    Breast cancer is the leading cause of death among women with cancer. Computer-aided diagnosis is an efficient method for assisting medical experts in early diagnosis, improving the chance of recovery. Employing artificial intelligence (AI) in the medical area is crucial due to the sensitivity of this field. This means that the low accuracy of the classification methods used for cancer detection is a critical issue. This problem is accentuated when it comes to blurry mammogram images. In this paper, convolutional neural networks (CNNs) are employed to present the traditional convolutional neural network (TCNN) and supported convolutional neural network (SCNN) approaches. The TCNN and SCNN approaches contribute by overcoming the shift and scaling problems included in blurry mammogram images. In addition, the flipped rotation-based approach (FRbA) is proposed to enhance the accuracy of the prediction process (classification of the type of cancerous mass) by taking into account the different directions of the cancerous mass to extract effective features to form the map of the tumour. The proposed approaches are implemented on the MIAS medical dataset using 200 mammogram breast images. Compared to similar approaches based on KNN and RF, the proposed approaches show better performance in terms of accuracy, sensitivity, specificity, precision, recall, time of performance, and quality of image metrics.
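
    For readers unfamiliar with the evaluation measures, the sketch below shows how standard confusion-matrix metrics (accuracy, sensitivity, specificity, precision) are usually computed for a binary classifier. The function name `binary_metrics` and the counts are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return common metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # true positive rate, also called recall
        "specificity": tn / (tn + fp),      # true negative rate
        "precision": tp / (tp + fp),        # share of positive calls that are correct
    }


# Hypothetical counts for illustration only
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

    Sensitivity and recall are the same quantity, which is why the two always move together when both are reported.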

    A Descriptive Analysis of Job Satisfaction among Faculty Members: Case of Private Vocational and Technical Education Institutions, Baabda, Mount Lebanon, Lebanon

    The study aimed to assess Job Satisfaction (JS) among teaching staff in private vocational and technical education institutions in Mount Lebanon, Lebanon. The study is descriptive and analytic, using a sample of 200 teachers from 13 schools and institutes chosen according to the coordinated random method. A questionnaire created and validated by Warr, Cook, and Wall is adopted to measure job satisfaction. This questionnaire includes personal information like job status, educational level, number of years of work, monthly income, age, gender, and social status. Using a seven-level Likert scale, it also contains 15 items to measure various dimensions of job satisfaction (internal and external factors). Results show a low overall mean of 4.69 out of 7 with a standard deviation of 1.15 for job satisfaction, based on data analysis using the SPSS program. Also, the majority of respondents are not satisfied with the wage received (the overall mean of job satisfaction=3.83, with a standard deviation of 2.00); there is a low level of JS among teachers concerning the degree of job security (mean=4.13 with a standard deviation of 1.91); there are no statistically significant differences in JS among teachers due to demographics. Capitalizing on the results, recommendations are made
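
    The summary statistics quoted above are means and standard deviations of seven-point Likert scores. A minimal sketch of that computation, using Python's standard library in place of SPSS and entirely hypothetical responses to the 15 items, might look like this:

```python
import statistics

# Hypothetical answers of one respondent to the 15 items
# (1 = lowest satisfaction, 7 = highest); not data from the study
responses = [5, 4, 6, 3, 5, 4, 2, 6, 5, 4, 3, 5, 6, 4, 5]

mean_score = statistics.mean(responses)   # arithmetic mean on the 1-7 scale
sd_score = statistics.stdev(responses)    # sample standard deviation (n - 1)
```

    Whether the study used the sample or the population standard deviation is not stated; `statistics.stdev` computes the sample form.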

    Genetic folding for solving multiclass SVM problems

    Genetic Folding (GF) algorithm is a new class of evolutionary algorithms specialized for complicated computer problems. GF algorithm uses a linear sequence of numbers of genes structurally organized in integer numbers, separated with dots. The encoded chromosomes in the population are evaluated using a fitness function. The fittest chromosome survives and is subjected to modification by genetic operators. The creation of these encoded chromosomes, with the fitness functions and the genetic operators, allows the algorithm to perform with high efficiency in the genetic folding life cycle. Multi-classification problems have been chosen to illustrate the power and versatility of GF. In classification problems, the kernel function is important to construct binary and multi classifier for support vector machines. Different types of standard kernel functions have been compared with our proposed algorithm. Promising results have been shown in comparison to other published works
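
    The "standard kernel functions" that an evolved kernel is typically compared against can be sketched as follows. These are the textbook linear, polynomial, and RBF kernels, not the GF-generated kernel; the parameter defaults (`degree`, `coef0`, `gamma`) are assumptions chosen for illustration.

```python
import math


def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree"""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


x, y = (1.0, 2.0), (3.0, 4.0)
values = (linear_kernel(x, y), polynomial_kernel(x, y), rbf_kernel(x, y))
```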

    Genetic Folding (GF) Algorithm with Minimal Kernel Operators to Predict Stroke Patients

    A stroke is a medical disorder in which blood arteries in the brain rupture, causing brain damage. Symptoms may appear when the brain’s blood supply and other nutrients are cut off. According to the World Health Organization, stroke is the leading cause of death and disability globally. Early recognition of the multiple warning signs of a stroke helps reduce the severity of the stroke. The paper presents a modified version of the Genetic Folding algorithm to predict stroke based on symptoms. Several machine learning models, including Logistic Regression, Decision Tree, Random Forest, Naïve Bayes, Support Vector Machine, and the proposed Minimal Genetic Folding, were compared to forecast the probability of having a stroke in the brain using a variety of physiological characteristics. The proposed minimal Genetic Folding approach has been developed using the open-access Stroke Prediction dataset using minimal kernel operators. The datasets generated and/or analyzed during the current study are available in the Kaggle repository. With an accuracy of 83.2%, the proposed minimal Genetic Folding approach outperformed Logistic Regression by 4.2%, Naïve Bayes by 1.2%, Decision Tree by 17.2%, and Support Vector Machine by 83.2%. The area under the curve of the proposed model exceeds that of earlier research by 7%, demonstrating that this model is more dependable and was the top-performing algorithm.

    Evaluation of CO2 Purification Requirements and the Selection of Processes for Impurities Deep Removal from the CO2 Product Stream

    Depending on the reference power plant, the type of fuel and the capture method used, the CO2 product stream contains several impurities which may have a negative impact on pipeline transportation, geological storage and/or Enhanced Oil Recovery (EOR) applications. All negative impacts require setting stringent quality standards for each application and purifying the CO2 stream prior to exposing it to any of these applications. In this paper, the CO2 stream specifications and impurities from the conventional post-combustion capture technology are assessed. Furthermore, the CO2 restricted purification requirements for pipeline transportation, EOR and geological storage are evaluated. Upon the comparison of the levels of impurities present in the CO2 stream and their restricted targets, it was found that the two major impurities which entail deep removal, due to operational concerns, are oxygen and water from 300 ppmv to 10 ppmv and 7.3% to 50 ppmv respectively. Moreover, a list of plausible technologies for oxygen and water removal is explored after which the selection of the most promising technologies is made. It was found that catalytic oxidation of hydrogen and refrigeration and condensation are the most promising technologies for oxygen and water removal respectively.
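
    To give a sense of how deep the required removal is, the sketch below converts the quoted inlet and target levels into removal fractions. It assumes the 7.3% water content is a volume percentage (so 1% = 10,000 ppmv), which the abstract does not state explicitly; the helper `removal_fraction` is a name introduced here.

```python
PPMV_PER_PERCENT = 10_000  # 1 vol% = 10,000 ppmv (assumes a volume basis)


def removal_fraction(inlet_ppmv, outlet_ppmv):
    """Fraction of an impurity that must be removed to reach the target."""
    return (inlet_ppmv - outlet_ppmv) / inlet_ppmv


# Oxygen: 300 ppmv down to 10 ppmv
oxygen_removal = removal_fraction(300, 10)

# Water: 7.3% (assumed vol%) down to 50 ppmv
water_removal = removal_fraction(7.3 * PPMV_PER_PERCENT, 50)
```

    Under that assumption, oxygen needs roughly 96.7% removal and water roughly 99.93%.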

    The Effect of COVID-19 on the Performance of SMEs in Emerging Markets in Iran, Iraq and Jordan

    This research aims to investigate the effect of COVID-19 on the performance of small and medium enterprises (SMEs) in emerging markets in Iran, Iraq and Jordan. In order to collect the required data, a standard questionnaire provided in the literature was used. The research period is the second quarter of 2022, and its population includes managers, accountants and auditors engaged in listed and non-listed companies. The research findings indicate that the outbreak of COVID-19 has affected SMEs’ performance in investigated emerging markets. For the first time, this research has examined the impact of COVID-19 on the performance of SMEs in emerging markets. The research was conducted in the three countries of Iran, Iraq and Jordan, which have different environmental conditions indicating the impact of contextual factors on the effects of the spread of COVID-19. The results can be useful for different parties, such as SMEs’ owners and regulatory bodies in similar markets.

    Evaluation of Handling and Reuse Approaches for the Waste Generated from MEA-based CO2 Capture with the Consideration of Regulations in the UAE

    A waste slip-stream is generated from the reclaiming process of monoethanolamine (MEA) based Post-Combustion Capture (PCC). It mainly consists of MEA itself, ammonium, heat-stable salts (HSS), carbamate polymers, and water. In this study, the waste quantity and nature are characterized for Fluor’s Econamine FGSM coal-fired CO2 capture base case. Waste management options, including reuse, recycling, treatment, and disposal, are investigated due to the need for a more environmentally sound handling. Regulations, economic potential, and associated costs are also evaluated. The technical, economic, and regulation assessment suggests waste reuse for NOx scrubbing. Moreover, a high thermal condition is deemed as an effective technique for waste destruction, leading to considerations of waste recycling into a coal burner or incineration. As a means of treatment, three secondary-biological processes covering Complete-Mix Activated Sludge (CMAS), oxidation ditch, and trickling filter are designed to meet the wastewater standards in the United Arab Emirates (UAE). From the economic point of view, the value of waste as a NOx scrubbing agent is 6 561 600–7 348 992 USD/year. The secondary-biological treatment cost is 0.017–0.02 USD/ton of CO2, while the cost of an on-site incinerator is 0.031 USD/ton of CO2 captured. In conclusion, secondary biological treatment is found to be the most economical option.

    Assessment of the specificity and stability of Micro-RNAs as a forensic gene marker

    Background: Forensic investigations depend on bodily fluid analysis to identify the perpetrators. Identifying perpetrators requires knowledge about suspects' body fluids. Due to their durability and tissue-specific expression patterns, miRNAs may be forensic indicators. However, miRNA expression patterns in various bodily fluids are seldom compared. This study examined miR-372, miR-135p, miR-124-3p, miR-16, and miR-10b expression in seminal fluids, blood stains, and vaginal secretions using quantitative PCR using SNORD-47 as a reference gene. This research compared miRNA expression levels in diverse body fluids to assess their potential as forensic biomarkers. MicroRNAs were isolated from forensic blood, seminal fluids, and vaginal mixed stains. Methods: Quantitative PCR measured miR-372, miR-135p, miR-124-3p, miR-16, and miR-10b gene expression. Normalization utilized SNORD-47. These miRNAs were compared in various bodily fluids. Results: The analysis of the results revealed that three bodily fluids have unique miRNA expression patterns. Seminal fluids expressed considerably more miR-135b and miR-10b than vaginal secretions. Vaginal fluids expressed more miR-372 and miR-124-3p than seminal fluids. Blood fluids expressed more miR-126 and miR-16 than seminal and vaginal fluids. Conclusion: MiR-126, miR-16, miR-372, and miR-124-3p were considerably more significant than SNORD-47 in blood, vaginal secretions, and seminal fluids
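
    The abstract does not give the normalization formula. One common convention for qPCR data normalized to a reference gene is the 2^(-ΔCt) method, sketched below purely as an assumption; the Ct values are hypothetical, and the calculation presumes roughly 100% amplification efficiency.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference gene via 2^(-dCt).

    Assumes the amount of product doubles every cycle
    (about 100% amplification efficiency).
    """
    delta_ct = ct_target - ct_reference
    return 2 ** (-delta_ct)


# Hypothetical Ct values for one miRNA against the reference gene
# in two sample types (not data from the study)
sample_a = relative_expression(ct_target=24.0, ct_reference=26.0)
sample_b = relative_expression(ct_target=28.0, ct_reference=26.0)

# Fold difference in relative expression between the two samples
fold_difference = sample_a / sample_b
```

    A lower Ct means the target crossed the detection threshold earlier, so it corresponds to higher expression.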